Fast and Adaptive Online Training of Feature-Rich Translation Models
نویسندگان
چکیده
We present a fast and scalable online method for tuning statistical machine translation models with large feature sets. The standard tuning algorithm—MERT—only scales to tens of features. Recent discriminative algorithms that accommodate sparse features have produced smaller than expected translation quality gains in large systems. Our method, which is based on stochastic gradient descent with an adaptive learning rate, scales to millions of features and tuning sets with tens of thousands of sentences, while still converging after only a few epochs. Large-scale experiments on Arabic-English and Chinese-English show that our method produces significant translation quality gains by exploiting sparse features. Equally important is our analysis, which suggests techniques for mitigating overfitting and domain mismatch, and applies to other recent discriminative methods for machine translation.
منابع مشابه
Online Streaming Feature Selection Using Geometric Series of the Adjacency Matrix of Features
Feature Selection (FS) is an important pre-processing step in machine learning and data mining. All the traditional feature selection methods assume that the entire feature space is available from the beginning. However, online streaming features (OSF) are an integral part of many real-world applications. In OSF, the number of training examples is fixed while the number of features grows with t...
متن کاملFeature-Rich Phrase-based Translation: Stanford University's Submission to the WMT 2013 Translation Task
We describe the Stanford University NLP Group submission to the 2013 Workshop on Statistical Machine Translation Shared Task. We demonstrate the effectiveness of a new adaptive, online tuning algorithm that scales to large feature and tuning sets. For both English-French and English-German, the algorithm produces feature-rich models that improve over a dense baseline and compare favorably to mo...
متن کاملReal-Time Output Feedback Neurolinearization
An adaptive input-output linearization method for general nonlinear systems is developed without using states of the system. Another key feature of this structure is the fact that, it does not need model of the system. In this scheme, neurolinearizer has few weights, so it is practical in adaptive situations. Online training of neuroline...
متن کاملNeuro-Fuzzy Based Algorithm for Online Dynamic Voltage Stability Status Prediction Using Wide-Area Phasor Measurements
In this paper, a novel neuro-fuzzy based method combined with a feature selection technique is proposed for online dynamic voltage stability status prediction of power system. This technique uses synchronized phasors measured by phasor measurement units (PMUs) in a wide-area measurement system. In order to minimize the number of neuro-fuzzy inputs, training time and complication of neuro-fuzzy ...
متن کاملAn Empirical Comparison of Features and Tuning for Phrase-based Machine Translation
Scalable discriminative training methods are now broadly available for estimating phrase-based, feature-rich translation models. However, the sparse feature sets typically appearing in research evaluations are less attractive than standard dense features such as language and translation model probabilities: they often overfit, do not generalize, or require complex and slow feature extractors. T...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2013